Impacts of Climate Change on Mortality: An extrapolation of temperature effects based on time series data in France

This R Markdown Notebook reproduces the numerical applications presented in the paper “Impacts of Climate Change on Mortality: An extrapolation of temperature effects based on time series data in France”, available here.

The objective of this Notebook is to illustrate the estimation techniques and the presentation of results described in this article. Since daily mortality data is not publicly available, the estimates obtained from the DLNM model are presented, but cannot be reproduced. However, we present the functions to be applied, enabling the reader to replicate the process with their own data.

For the sake of brevity and code clarity, we limit the presentation to the aggregated results in Metropolitan France only. We also consider only the scenarios related to one climate model instead of the 12 presented in this paper. This simplification does not limit the understanding of the code or the reproducibility of the results.

Note that we use: - an adjusted version of the MultiMoMo package. Some functions have been re-coded or added to better suit our application. - the dlnm package and an adjusted version of the attrdl function initially developed by Antonio Gasparrini.

1 Set-up

First, we set up the working environment and load the necessary functions and packages for the application.

# Folders
fold <- getwd()
fold_data <- paste0(fold, "/data/")
fold_bib <- paste0(fold, "/functions/")
# Load functions
for (f in list.files(fold_bib))
  source(paste(fold_bib,f,sep=""), encoding = "UTF-8")
#----------------------------------------------------------
# Load packages
invisible(lapply(c(
  "MultiMoMo",
  "HMDHFDplus",
  "systemfit",
  "lattice",
  "grid",
  "ggplot2",
  "gridExtra",
  "locfit",
  "scales",
  "lubridate",
  "splines",
  "mgcv",
  "dlnm",
  "MASS",
  "mvtnorm",
  "Rfast",
  "parallel",
  "data.table",
  "formattable",
  "RColorBrewer",
  "readxl",
  "kableExtra",
  "dplyr",
  "tidyr",
  "tidyverse",
  "rmarkdown",
  "ggthemes",
  "cowplot",
  "ggridges",
  "viridis",
  "hrbrthemes",
  "colorspace",
  "ggbeeswarm",
  "ggpubr"),
  instal.import.package))
options(dplyr.summarise.inform = FALSE)
#----------------------------------------------------------
# Graphical parameters
theme_set(theme_bw())
trellis.device(color = FALSE)
# Forecasting period
start_year <- 2020
end_year <- 2100
year_breaks <- c(1980, 1989, 1999, 2009, 2019, 2029, 2039, 2049, 2059, 
                 2069, 2079, 2089, 2100)
year_labels <- c("1980s", "1990s", "2000s", "2010s", "2020s", "2030s", 
                 "2040s", "2050s", "2060s", "2070s", "2080s","2090s")

# Heatwave parameter - french definition
min_level <- 18
max_level <- 30

# Monte-Carlo simulations
nsim <- 1000 # number of simulations
parallel <- "no" # parallel option
ncpus <- 1L # number of cores

# Other parameters
AGE_MAX <- 105

2 Load data

In this section, we start by loading the input data. Three types of information are considered: historical mortality data (including daily temperatures and daily deaths), annual mortality data, and climate scenarios. More information regarding the data description is available in the paper.

2.1 Daily Mortality data

We import daily data series for death counts (which are not freely available) and temperatures. This dataset is constructed by merging:

  • daily mortality data,
  • daily temperature data.
daily_data <- get(load(file =  paste0(fold_data, "/daily_data.RData")))

# Summary of the data - list with 2 items for females (`f`) and males (`m`)
head(daily_data)
## $m
##           datedec age_bk nb_deaths years     tmax    tmax3        tmin
##            <Date> <char>     <num> <int>    <num>    <num>       <num>
##     1: 1980-01-01   0-64       265  1980 5.842857 6.464286  0.22142857
##     2: 1980-01-01  65-74       190  1980 5.842857 6.464286  0.22142857
##     3: 1980-01-01  75-84       246  1980 5.842857 6.464286  0.22142857
##     4: 1980-01-01    85+       125  1980 5.842857 6.464286  0.22142857
##     5: 1980-01-02   0-64       276  1980 3.742857 5.173810 -0.95000000
##    ---                                                                
## 58436: 2019-12-30    85+       324  2019 8.085714 8.195238  0.02142857
## 58437: 2019-12-31   0-64       145  2019 7.285714 7.278571  0.22142857
## 58438: 2019-12-31  65-74       175  2019 7.285714 7.278571  0.22142857
## 58439: 2019-12-31  75-84       215  2019 7.285714 7.278571  0.22142857
## 58440: 2019-12-31    85+       313  2019 7.285714 7.278571  0.22142857
##            tmin3     tavg    tavg3  time   dow month
##            <num>    <num>    <num> <num> <int> <int>
##     1: 1.8190476 2.289286 3.801190     1     3     1
##     2: 1.8190476 2.289286 3.801190     1     3     1
##     3: 1.8190476 2.289286 3.801190     1     3     1
##     4: 1.8190476 2.289286 3.801190     1     3     1
##     5: 0.2261905 1.550000 2.483333     2     4     1
##    ---                                              
## 58436: 2.0380952 3.271429 4.640476 14609     2    12
## 58437: 0.3928571 3.464286 3.378571 14610     3    12
## 58438: 0.3928571 3.464286 3.378571 14610     3    12
## 58439: 0.3928571 3.464286 3.378571 14610     3    12
## 58440: 0.3928571 3.464286 3.378571 14610     3    12
## 
## $f
##           datedec age_bk nb_deaths years     tmax    tmax3        tmin
##            <Date> <char>     <num> <int>    <num>    <num>       <num>
##     1: 1980-01-01   0-64       122  1980 5.842857 6.464286  0.22142857
##     2: 1980-01-01  65-74       117  1980 5.842857 6.464286  0.22142857
##     3: 1980-01-01  75-84       280  1980 5.842857 6.464286  0.22142857
##     4: 1980-01-01    85+       271  1980 5.842857 6.464286  0.22142857
##     5: 1980-01-02   0-64       116  1980 3.742857 5.173810 -0.95000000
##    ---                                                                
## 58436: 2019-12-30    85+       552  2019 8.085714 8.195238  0.02142857
## 58437: 2019-12-31   0-64       107  2019 7.285714 7.278571  0.22142857
## 58438: 2019-12-31  65-74        98  2019 7.285714 7.278571  0.22142857
## 58439: 2019-12-31  75-84       181  2019 7.285714 7.278571  0.22142857
## 58440: 2019-12-31    85+       514  2019 7.285714 7.278571  0.22142857
##            tmin3     tavg    tavg3  time   dow month
##            <num>    <num>    <num> <num> <int> <int>
##     1: 1.8190476 2.289286 3.801190     1     3     1
##     2: 1.8190476 2.289286 3.801190     1     3     1
##     3: 1.8190476 2.289286 3.801190     1     3     1
##     4: 1.8190476 2.289286 3.801190     1     3     1
##     5: 0.2261905 1.550000 2.483333     2     4     1
##    ---                                              
## 58436: 2.0380952 3.271429 4.640476 14609     2    12
## 58437: 0.3928571 3.464286 3.378571 14610     3    12
## 58438: 0.3928571 3.464286 3.378571 14610     3    12
## 58439: 0.3928571 3.464286 3.378571 14610     3    12
## 58440: 0.3928571 3.464286 3.378571 14610     3    12
# Get quantiles for extreme hot and cold of average daily temperature
q025 <- quantile(daily_data$m$tavg, 0.025)
q975 <- quantile(daily_data$m$tavg, 0.975)

The object daily_data is a list containing two datasets for females (f) and males (m). Each dataset contains:

  • datedec: the date,
  • age_bk: age buckets,
  • nb_deaths: number of deaths,
  • years: years,
  • tmax: maximum temperature of the day
  • tmax3: maximum temperature over the last 3 days
  • tmin: minimum temperature over the last 3 days
  • tmin3: minimum temperature over the last 3 days
  • tavg: average temperature of the day
  • tavg3: average temperature over the last 3 days
  • time: day number as a integer number
  • dow: day of the week as a integer number
  • month: month number

Our DLNM model for nb_deaths is fitted based on the tavg variable, as well as datedec, dow and month.

2.2 Annual mortality data from the HMD

The annual mortality data considered are extracted from the Human Mortality Database . We downloaded the exposure to risk and death counts from the HMD website. We consider data from Metropolitan France for the calibration period \(\mathcal{T}_y = \left\lbrace 1980, \ldots, 2019 \right\rbrace\) and the age range \(\mathcal{X}\)\(= \left\lbrace 0, \ldots, 105 \right\rbrace\).

mort_dt <- get(load(file =  paste0(fold_data, "/mort_dt.RData")))

2.3 Data from climate models

We import daily data series for climate trajectories. The models selected in the paper are the 12 models available through the portal for RCP2.6, RCP4.5, and RCP8.5 scenarios. In this Notebook, we only consider the model CNRM-CM5 / ALADIN63 with the RCP8.5 scenario.

clim_model <- get(load(file = paste0(fold_data, 
                                     "ALADIN63_CNRM-CM5_RCP8.5.RData")))

3 Estimating the DLNM model

The distributed lag non-linear generalized model (DLNM) is a reference standard in climate epidemiology for assessing the impact of delayed environmental factors on a response variable, see e.g. A. Gasparrini, Armstrong, and Kenward (2010) or Antonio Gasparrini (2014).

We fit DLNM models for each sex \(g\) and each age bucket \(k\). This model aims to assess the influence of temperatures on the daily number of deaths.

Let \(\lambda_{k,t,d}^{(g)} = \mathbb{E}\left({D_{k,t,d}^{(g)}}\right)\) represents the expected number of deaths for day \(d\), year \(t\) and sub-group \(k\). \(\vartheta_{d,t}\) denotes the daily average temperature of day \(d\) for year \(t\). The number of daily deaths \(D_{k,t,d}^{(g)}\), aggregated by \(K\) age groups and by sex \(g\), for each day \(d \in \mathcal{D}^\star = \left\lbrace 1,2,\ldots, 365, (366)\right\rbrace\) of year \(t\) is modeled by \[ \ln(\lambda_{k,t,d}^{(g)})=\eta_k^{(g)} + s(\vartheta_{d,t}, L; \boldsymbol{\theta}_k^{(g)} ) + \sum_{p=1}^{P}{h_p( z_{d,p} ; \boldsymbol{\zeta}_{k,p}^{(g)})}, \] where:

  • \(s (\vartheta_{d,t} , L ; \boldsymbol{\theta}_{k}^{(g)}) = \sum_{l=0}^{L}{f\cdot w(\vartheta_{d-l,t},l;\boldsymbol{\theta}_k^{(g)})}\) is a cross-basis non-linear function, capturing the cumulated effect of the daily average temperature \(\vartheta_{d,t}\) over a maximum of \(L\) days where \(f\cdot w \left( \cdot, \cdot \right)\) is a bi-dimensional exposure–lag–response function,
  • \(h_p( z_{d,p} ; \boldsymbol{\zeta}_{k,p}^{(g)})\) are smooth univariate functions of categorical time variables \(z_{d,p}\), e.g. day, day of the week, month or year. They capture residual seasonal effects, demographic shifts, or other long-term trends not accounted for by the other covariates.

We consider a bi-dimensional spline function \(s(\cdot, \cdot)\) as the surface of Relative Risk (RR). All DLNM models are calibrated based on daily average temperature data over the period 1980-2019, using the dlnm package (A. Gasparrini 2011). We consider for the exposure–response curve a natural cubic B-spline with three internal knots that are placed at the 10th, 75th, and 90th percentiles of the daily temperature distribution within the observational period. We also use a natural cubic B-spline for the lag-response curve with an intercept and three internal knots placed at evenly spaced intervals on the log scale. A maximum of \(L=21\) days is considered to account for delayed effects of both cold and hot temperatures.

3.1 Parameters

# Segmentation parameters
list_sexe <- list(m = 1, f = 2)
age_breaks <- c(0, 64, 74, 84, Inf)
age_labels <- c("0-64", "65-74", "75-84", "85+")

# Training period
annee_deb <- 1980
annee_fin <- 2019

# DLNM Parameters
param_dlnm_whole <- list(
  # main model, cubic natural spline with three internal knots in
  # the 10th, 75th, 90th percentiles of the temperature distribution
  varfun = "bs",
  vardegree = 2,
  varper  = c(10, 75, 90),
  ## specification of the lag function
  # Definition of the maximum lag, that is, 21 days
  lag  = 21,
  lagnk  = 3,
  ## degree of freedom for seasonality
  dfseas  = 8,
  ## degree of freedom for trend
  dftrend = NULL
  )

3.2 Model estimation

We fit the model for each sex and age bucket.

fit <- lapply(daily_data, function(x)
{
  fit_dlnm(x, param_dlnm_whole, per_age = T, summer = F, psi = NULL)
}
)

3.3 Results and diagnostics

Based on Antonio Gasparrini and Leone (2014), we calculate the temperature-attributable fraction for day \(d\) as \[ \widehat{\text{AF}}^{(g)}_{k,t,d} = 1-\exp{\left(- \sum_{l=0}^{L}{f\cdot w(\vartheta_{d,l}, l;\widehat{\boldsymbol{\theta}}_k^{(g)})} \right)}. \] Then, we deduce the death counts attributable to temperatures for each day of the year \(d\), and age \(x \in X_k\), \(k \in \left\lbrace1, \ldots, K\right\rbrace\), using the formula \[ \widehat{\bar{D}}_{x,t,d}^{(g)} = \widehat{\text{AF}}^{(g)}_{k,t,d} \times \sum_{l=0}^{L} \frac{D_{x,t,d+l}^{(g)}}{L+1}. \] The term \(\exp{\left(s(\vartheta_{d,t}, L; \widehat{\boldsymbol{\theta}}_k^{(g)} )\right)}\) corresponds to the cumulative RR of temperature mortality. We display below the cumulative RR curves for temperatures.

summary_dlnm_all_sex(fit)

For analyzing the goodness of fit performance of our models, we compute the estimated daily death counts for each day \(d\) of year \(t\), which is given by \[ \widehat{D}_{k,t,d}^{(g)} = \exp{\left(s(\vartheta_{d,t}, L; \widehat{\boldsymbol{\theta}}_k^{(g)} )\right)} \exp{\left(\widehat{\eta}_k^{(g)} + z_{d,1} \widehat{\zeta}_{k,1}^{(g)} + \sum_{t \in \mathcal{T}_y}{h_t( z_{t,p} ; \widehat{\boldsymbol{\zeta}}_{k,t}^{(g)})}\right)}. \] Based on these predicted counts, we display the residuals of the Deviance by year and age group, and compare the distribution of observed (in blue) and predicted (in green) death counts for each month, distinguishing by sex and decade for females and males.

# in-sample performance of the fitted model for the whole year.
perf_mort <- lapply(names(daily_data), function(x){
  perf_dnlm(fit[[x]])
})
names(perf_mort) <- names(daily_data)

3.3.1 Females

3.3.2 Males

3.4 Estimation of the temperature- attributable fractions

The model’s setup allows for a range of predictions that precisely gauge the influence of heat, cold, or specific events occurring on each day of subset \(\mathcal{D}_t \subseteq \mathcal{D}^\star\) of year \(t\).

The sum of the contributions from each day of this subperiod \(\mathcal{D}_t\) for a year \(t \in \mathcal{T}_y\) is as follow \[ \widehat{\bar{D}}_{x,t}^{(g)} = \sum_{d \in \mathcal{D}_t}{\widehat{\bar{D}}_{x,t,d}^{(g)}\textbf{1}_{\left\lbrace d \in \mathcal{D}_t\right\rbrace }}. \]

Thus, the total attributable fraction is estimated for each age \(x \in X_k\) and year \(t \in \mathcal{T}_y\) as \[ \widehat{\text{AF}}_{x,t}^{(g)} (\mathcal{D}_t)= \dfrac{\widehat{\bar{D}}_{x,t}^{(g)}}{\sum_{d \in \mathcal{D}_t} D_{x,t,d}^{(g)}}, \]

We compute the temperature-attributable fractions \(\widehat{\text{AF}}_{x,t}^{(g)} (\mathcal{D}_t)\) for both women and men, and aggregate them over all ages to facilitate visualization on the figure below. The estimation error is calculated based on 1000 replications of parametric bootstraps.

# Run excess of mortality
xs_mort <- lapply(names(daily_data), function(x){
  res <- excess_mort_dlnm(daily_data[[x]], fit[[x]], q_range = c(q025, q975),
                          nsim = nsim, parallel = parallel, ncpus = ncpus)
  return(res)
  })
names(xs_mort) <- names(daily_data)

# arrange data
xs_data <- lapply(names(xs_mort), function(i){
  xs_mort[[i]]$excess_mort %>%
    dplyr::mutate(gender = i) %>%
    rename(year = years)
})
xs_data <- do.call("rbind", xs_data)

# print attributable fractions
summary_temp_effect(xs_data)

4 The Li-Lee model

We decomposition the number of deaths and death rates into two components \[ D_{x,t}^{(g)} = \widetilde{D}_{x,t}^{(g)} + \bar{D}_{x,t}^{(g)} \Rightarrow \widehat{m}_{x,t}^{(g)} = \widetilde{m}_{x,t}^{(g)} + \bar{m}_{x,t}^{(g)}, \] where \(\bar{D}_{x,t}^{(g)}\) and \(\widetilde{D}_{x,t}^{(g)}\) are respectively the number of deaths attributable and non attributable to temperature effects.

Then, we consider the two-populations Li and Lee (2005) model for the crude central death rates non attributable to temperature effects as \[ \ln \left(\widetilde{m}_{x,t}^{\left(g\right)} \right)= A_x + B_{x}K_{t} + \alpha_{x}^{\left( g \right)} + \beta_{x}^{\left( g \right)} \kappa_{t}^{\left( g \right)}. \]
We use the following specifications:

  • Identifiability constraints

\[ \sum\limits_{t \in \mathcal{T}_y} K_{t} = 0 \text{ and } \sum\limits_{x \in \mathcal{X}} B_{x}^2 = 1, \] \[ \sum\limits_{t \in \mathcal{T}_y} \kappa_{t}^{\left( g \right)} = 0 \text{ and } \sum\limits_{x \in \mathcal{X}} (\beta_{x}^{\left( g \right)})^2 = 1, \text{ for } g \in \mathcal{G} \]

  • Time series model with coherence assumption

\[ K_{t} = \delta + K_{t-1} + e_{t} \;\; (\text{RWD with drift}) \] \[ \kappa_{t}^{\left( g \right)} = c^{\left( g \right)} + \phi^{\left( g \right)} \kappa_{t-1}^{\left( g \right)} + r_{t}^{\left( g \right)} \;\; (\text{AR(1) with drift and } \vert \phi^{(g)} \vert <1). \] Errors terms are white noises with a mean of zero and a variance-covariance matrix \(\boldsymbol{\Sigma}\).

4.1 Fitting the Poisson regression model

For calibrating the parameters \(({A}_x, {B}_x, {K}_{t}, {\alpha}_{x}^{(f)}, {\beta}_{x}^{(f)}, {\kappa}_{t}^{(f)}, {\alpha}_{x}^{(m)}, {\beta}_{x}^{(m)}, {\kappa}_{t}^{(m)})\), we use a Poisson assumption for the number of virtual deaths as in Brouhns, Denuit, and Vermunt (2002)

\[ \widetilde{D}_{x,t}^{(g)} \sim \text{Pois}\left( E_{x,t}^{(g)} \widetilde{m}_{x,t}^{(g)}\right). \]
Thus, with the attributable fraction \(\text{AF}^{(g)}_{x,t}\), we also have a Poisson formulation with log-link function for the overall number of deaths \[ {D}_{x,t}^{(g)} \sim \text{Pois}\left( E_{x,t}^{(g)} T_{x,t}^{(g)} \widetilde{m}_{x,t}^{(g)}\right), \] where \(T_{x,t}^ {(g)} = \left( 1 - \text{AF}^{(g)}_{x,t}\right)^{-1}\) can be consider into the offset term.

4.1.1 Calculating the offset term based on the attributable fraction

First, we define the mortality model hyper-parameters, i.e. the age range, the observation period, and the name of each group.

xv <- 0:105
yv <- 1980:2019
countries <- c("FRATNP")
group <- c("Female", "Male", "Total")

Then, we select all_effect (all the effects of temperatures over the year) and assume that the offset term \(T_{x,t}^{(g)}\) is constant for all ages in each bucket.

# Data for women and men
xs_data <- get(load(file =  paste0(fold_data, "/xs_data.RData")))

xs_f <- xs_data %>% 
  dplyr::filter(gender == "f", temp_effect  == "all_effect") %>%
  dplyr::select(year, age_bk, ajust_factor)

xs_m <- xs_data %>% 
  dplyr::filter(gender == "m", temp_effect  == "all_effect") %>%
  dplyr::select(year, age_bk, ajust_factor)

# Transformation to matrix
transf_attrib_matrix <- function(df){
  mat <- matrix(data = 1, ncol = length(xv), nrow = length(yv))
  for(tt in yv){
    mat[tt - yv[1] + 1, 1:65] <- dplyr::filter(df, year == tt,
                                               age_bk == "0-64")$ajust_factor
    mat[tt - yv[1] + 1, 66:75] <- dplyr::filter(df, year == tt,
                                                age_bk == "65-74")$ajust_factor
    mat[tt - yv[1] + 1, 76:85] <- dplyr::filter(df, year == tt,
                                                age_bk == "75-84")$ajust_factor
    mat[tt - yv[1] + 1, 86:(max(xv) + 1)] <- dplyr::filter(
      df, year == tt, age_bk == "85+")$ajust_factor
  }
dimnames(mat) <- list(yv, xv)
return(mat)
}

xs_f <- transf_attrib_matrix(xs_f)
xs_m <- transf_attrib_matrix(xs_m)

4.1.2 Adjusting exposure to risk

Now, we multiply the offset term \(T_{x,t}^{(g)}\) by the exposure to risk. Subsequently, we construct a new dataset that we use to fit the Li-Lee model.

# Create datasets for deaths with ajusted exposure
adj_mort_dt <- list(
  UNI = list(
    f = list(dtx = round(mort_dt$UNI$f$dtx),
             etx = mort_dt$UNI$f$etx * xs_f,
             wa = mort_dt$UNI$f$wa),
    m = list(dtx = round(mort_dt$UNI$m$dtx),
             etx = mort_dt$UNI$m$etx * xs_m,
             wa = mort_dt$UNI$m$wa)),
  ALL = list(dtx = mort_dt$ALL$dtx,
             etx = mort_dt$ALL$etx,
             wa = mort_dt$ALL$wa)
)
## replace exposure to adjusted exposure
adj_mort_dt$ALL$etx <- adj_mort_dt$UNI$f$etx + adj_mort_dt$UNI$m$etx

4.1.3 Model calibration

We calibrate the parameters \(({A}_x, {B}_x, {K}_{t}, {\alpha}_{x}^{(f)}, {\beta}_{x}^{(f)}, {\kappa}_{t}^{(f)}, {\alpha}_{x}^{(m)}, {\beta}_{x}^{(m)}, {\kappa}_{t}^{(m)})\) of the Li-Lee model using both the original and adjusted exposure to risk. This allows to assess the impact of temperature-attributable deaths on these estimated parameters.

These models are estimated by maximizing the Poisson log-likelihood. As detailed in the paper, our procedure consists in two steps. First, we estimate the common parameters \(({A}_x, {B}_x, {K}_{t})\), followed by the estimation of sex-specific parameters. This estimation is performed using the R-package MultiMoMo (Antonio, Devriendt, and Robben 2022).

# Models with the original exposure to risk
fit_M <- fit_li_lee(xv, yv, yv, "m", data = mort_dt, method = "NR",
                    Ax = TRUE, exclude_coh = FALSE)
fit_F <- fit_li_lee(xv, yv, yv, "f", data = mort_dt, method = "NR",
                    Ax = TRUE, exclude_coh = FALSE)
# Models with the adjusted exposure to risk
adj_fit_M <- fit_li_lee(xv, yv, yv, "m", data = adj_mort_dt, method = "NR",
                        Ax = TRUE, exclude_coh = FALSE)
adj_fit_F <- fit_li_lee(xv, yv, yv, "f", data = adj_mort_dt, method = "NR",
                        Ax = TRUE, exclude_coh = FALSE)

The calibrated parameters are displayed as follow.

list_fit <- list(fit_F = fit_F,
                 fit_M = fit_M,
                 adj_fit_F = adj_fit_F,
                 adj_fit_M = adj_fit_M)
# print figures
myplot_parameters_li_lee(xv, yv, list_fit)

4.1.4 Goodness of fit

We now display residuals for the model with exposures adjusted of temperature effects (model with original exposure provides similar results).

fig_res_f <- plot_residuals_li_lee(xv, yv, adj_fit_F, "FR", "Female")
fig_res_m <- plot_residuals_li_lee(xv, yv, adj_fit_M, "FR", "Male")

plot_grid(fig_res_f + ggtitle("Females"), 
          fig_res_m + ggtitle('Males'),
          label_size = 12)

4.2 Calibrating and forecasting the time series model

In this section, we focus on calibrating the time series model, rewritten in matrix form as \[ \boldsymbol{Y}_{t} = \boldsymbol{\Upsilon}+ \boldsymbol{\Phi}\boldsymbol{Y}_{t-1} + \boldsymbol{E}_t, \] where \[ \boldsymbol{Y}_{t} = \begin{pmatrix} K_t \\ \kappa_t^{(f)} \\ \kappa_t^{(m)} \end{pmatrix}, \boldsymbol{\Upsilon}= \begin{pmatrix} \delta \\ c^{(f)} \\ c^{(m)} \end{pmatrix}, \boldsymbol{\Phi}= \begin{pmatrix} 1 & 0 & 0\\ 0 & \phi^{(f)} & 0\\ 0 & 0 & \phi^{(m)}\\ \end{pmatrix} \text{ and } \boldsymbol{E}_t = \begin{pmatrix} e_t \\ r_t^{(f)} \\ r_t^{(m)} \end{pmatrix}. \]

We calibrate this model using two sets of parameters for \(\boldsymbol{Y}_{t}\), i.e.  the parameters obtained with the original exposure to risk and those obtained with the temperauture-adjusted exposure to risk. These parameters are also estimated through maximum likelihood using R-package MultiMoMo.

4.2.1 Specification

We consider the projection period from the last year of the calibration set to \(H\), the projection horizon.

## Forecasting Parameters
arima_spec <- list(K.t_M = "RWD", k.t_M = "AR1.1", k.t_F = "AR1.1")
n_ahead    <- length(start_year:end_year)
n_sim      <- nsim
est_method <- "PORT"

4.2.2 Calibration and projection

In this section, we estimate the parameters \(\boldsymbol{\Upsilon}\), \(\widehat{\Phi}\), and \(\widehat{\boldsymbol{\Sigma}}\). Then, we forecast \(\boldsymbol{Y}_t\) by simulating the innovation errors \(\boldsymbol{E}_t\) from a multivariate Gaussian vector with zero mean and covariance matrix \(\widehat{\boldsymbol{\Sigma}}\). In this application, 1000 Monte Carlo simulations is considered.

set.seed(1234)
# Model with original exposures
proj_par <- project_parameters(fit_M, fit_F, n_ahead, n_sim,
                               arima_spec, est_method)
# Model with adjusted exposures
adj_proj_par <- project_parameters(adj_fit_M, adj_fit_F, n_ahead, n_sim,
                                   arima_spec, est_method)

We verify that the series modeled by an AR(1) process are stationary, i.e. parameter estimates \(\widehat{k}_t^{(g)}\) are smaller than 1.

tab_param <- data.frame(
  v1 = c("Original Li-Lee model", "Adjusted exposure to risk"),
  v2 = c(proj_par$coef_KtM, adj_proj_par$coef_KtM),
  v3 = c(proj_par$coef_ktM[1], adj_proj_par$coef_ktM[1]),
  v4 = c(proj_par$coef_ktM[2], adj_proj_par$coef_ktM[2]),
  v5 = c(proj_par$coef_ktF[1], adj_proj_par$coef_ktF[1]),
  v6 = c(proj_par$coef_ktF[2], adj_proj_par$coef_ktF[2])
)
# Print table
tab_param %>%
  kbl( booktabs = T,
       align='c',
       col.names = c("Calibration data", "$\\delta$",
                     "$c^{(m)}$", "$\\phi^{(m)}$",
                     "$c^{(f)}$", "$\\phi^{(f)}$"),
       digits=4, 
       escape =  FALSE) %>%
  kable_classic_2(full_width = F) %>%
  kable_styling(latex_options = "hold_position",  font_size = 8)
Calibration data \(\delta\) \(c^{(m)}\) \(\phi^{(m)}\) \(c^{(f)}\) \(\phi^{(f)}\)
Original Li-Lee model -0.2340 -0.0066 0.9692 -0.0266 0.9963
Adjusted exposure to risk -0.2341 -0.0045 0.9697 -0.0256 0.9999

The estimated covariance matrix \(\boldsymbol{C}\) is given as follow.

# Create correction matrix
corr_function <- function(S)
  {
    D <- diag(sqrt(diag(S)))
    Dinv <- solve(D)
    R <- Dinv %*% S %*% Dinv
    dimnames(R) <- dimnames(S)
    R 
  }

# Show correction matrix in table
tab_corr <- cbind(data.frame(corr_function(proj_par$cov_mat)), 
                  data.frame(corr_function(adj_proj_par$cov_mat)))
names(tab_corr) <- paste0("v", 1:ncol(tab_corr))
row.names(tab_corr) <- NULL

# Print table
tab_corr %>%
  dplyr::mutate(v = c("$e_t$", "$r_t^{(m)}$", "$r_t^{(f)}$"),
                .before = "v1") %>%
  kbl(booktabs = T,
       align='c',
       col.names = c("", "$e_t$", "$r_t^{(m)}$", "$r_t^{(f)}$",
                     "$e_t$", "$r_t^{(m)}$", "$r_t^{(f)}$"),
      digits=4, 
      escape =  FALSE) %>%
  add_header_above(c(" " = 1, "Original Li-Lee model" = 3,
                     "Adjusted exposure to risk" = 3)) %>%
  kable_classic_2(full_width = F) %>%
  kable_styling(latex_options = "hold_position",  font_size = 8)
Original Li-Lee model
Adjusted exposure to risk
\(e_t\) \(r_t^{(m)}\) \(r_t^{(f)}\) \(e_t\) \(r_t^{(m)}\) \(r_t^{(f)}\)
\(e_t\) 1.0000 -0.7393 0.9449 1.0000 -0.5648 0.9330
\(r_t^{(m)}\) -0.7393 1.0000 -0.7152 -0.5648 1.0000 -0.5447
\(r_t^{(f)}\) 0.9449 -0.7152 1.0000 0.9330 -0.5447 1.0000

4.2.3 Projecting parameters

We present the projection of parameters \(\widehat{K}_t\), \(\widehat{\kappa}_t^{(f)}\), and \(\widehat{\kappa}_t^{(m)}\) over the period 2020-2100 with the \(95\%\) prediction intervals. These projections are shown for both Li-Lee models with and without temperature effects.

p <- plot_kt_sim(proj_par, adj_proj_par, n_sim, yv[1], end_year)
print(p)

5 Multi-mortality and climate models

Now that death rates not attributable to temperature effects can be projected, we focus on adding mortality shocks resulting from daily temperature variations. To achieve this, we consider a temperature trajectory derived from an external climate model \((\vartheta_{d,t})\), and then project the central death rates with temperature effects accumulated over period \(\mathcal{D}_t\) as \[ \widehat{m}_{x,t}^{(g)} = {\widetilde{m}}_{x,t}^{(g)} \left[ 1 + \sum_{d \in \mathcal{D}_t}{ \omega_{x,t,d}^{(g)} \text{AF}_{x,d,t}^{(g)} (1- \text{AF}_{x,d,t}^{(g)})^{-1}\boldsymbol{1}_{\left\lbrace d \in \mathcal{D}_t \right\rbrace }}\right]. \] where \(\omega_{x,t,d}^{(g)} =\widetilde{D}_{x,t,d}^{(g)}/\widetilde{D}_{x,t}^{(g)}\) corresponds to the weight associated with the distribution of death counts not attributable to temperatures over the period \(\mathcal{D}_t\) of year \(t\).

5.1 Projecting future mortality

Now, we simulate the projected all-cause death rates and the projected all-cause death rates not attributable to temperatures. \[ \widehat{q}_{x,t}^{(g)} = 1 - \exp{\left(-\widehat{\widehat{m}}_{x,t}^{(g)}\right)}, \quad \widehat{\widetilde{q}}_{x,t}^{(g)} = 1 - \exp{\left(\widehat{\widetilde{m}}_{x,t}^{(g)}\right)}, \]

For each simulation, we also compute the number of years of life expectancy lost (or gained) due to temperatures \[ \Delta \widehat{e}_{x,t}^{(g)} = \sum_{k=1}^{t_{\max}}{\left[ \prod_{j=0}^{k-1}{\left(1-\widehat{\widetilde{q}}_{x,j}^{(g)}\right)} - \prod_{j=0}^{k-1}{\left(1-\widehat{q}_{x,j}^{(g)}\right)} \right]}. \]

The first step is to simulate the death rates \(\hat{m}_{x,t}^{(g)}\) and \(\widetilde{m}_{x,t}^{(g)}\) derived respectively from the original Li-Lee model and the model with adjusted exposure to risk.

## project mortality rates considering historical temperature effect 
proj_rates <- project_mortality_rates(fit_M, fit_F, proj_par)

## project mortality rates considering without temperature effect 
adj_proj_rates <- project_mortality_rates(adj_fit_M, adj_fit_F, adj_proj_par)

# reformat data
format_rates <- function(rates)
{
  rates <- melt(rates)
  names(rates) <- c("age", "year", "sim", "Qxt")
  rates <- rates %>%
    mutate(age_bk = cut(age, age_breaks, age_labels, include.lowest = T))
}

proj_rates <- lapply(proj_rates, format_rates)
names(proj_rates) <- c("m", "f")

adj_proj_rates <- lapply(adj_proj_rates, format_rates)
names(adj_proj_rates) <- c("m", "f")

Second, we compute the weight associated with the distribution of death counts not attributable to temperatures over the period \(\mathcal{D}_t\) of year \(t\) \[ \omega_{x,t,d}^{(g)} =\widetilde{D}_{x,t,d}^{(g)}/\widetilde{D}_{x,t}^{(g)}. \]

We compute this weight as an average over the last 5 years.

# Run excess of mortality
xs_mort <- lapply(names(daily_data), function(x){
  res <- excess_mort_dlnm(daily_data[[x]], fit[[x]], q_range = c(q025, q975),
                          nsim = 1, parallel = "no", ncpus = ncpus)
  return(res)
  })
names(xs_mort) <- names(daily_data)

# Compute weights
un_an_weight <- lapply(names(xs_mort), function(x){
  # Add day and months
  # Filter on the last five years
  res <- xs_mort[[x]]$excess_mort %>%
    mutate(day = day(datedec),
           month = month(datedec)) %>%
    filter(years %in% c(2015:2019), temp_effect == "all_effect") %>%
    mutate(un_an = cases - an) %>%
    dplyr::select(years, age_bk, un_an, day, month) %>%
    group_by(age_bk, day, month) %>%
    summarise(un_an = sum(un_an))
  
  # Calculate the denominator of the weight
  xs_mort_agg <- res %>%
    group_by(age_bk) %>%
    summarise(un_an_sum = sum(un_an))
  
  res <- res %>%
    left_join(xs_mort_agg, by = "age_bk") %>%
    mutate(weight = un_an / un_an_sum) %>%
    mutate(un_an = NULL,
           un_an_sum = NULL)
  # Correct 29-02
  res$weight[res$day==29 & res$month==2] <- 5 * 
    res$weight[res$day==29 & res$month==2]
  
  return(res)
})
names(un_an_weight) <- names(xs_mort)

Next, we apply the projected additional mortality related to future temperature. Estimation errors related to the DLMN model is considered with 1000 bootstrap simulations.

It is worth noting that it is also possible to simultaneously consider temperature simulations from multiple climate models to capture this source of uncertainty. However, we do not include this step in this Notebook to simplify the presentation. Implementing this approach is not difficult and is been discussed in the paper.

# Change names dates
clim_model <- clim_model %>% dplyr::rename(datedec = date)
# Select the forecasting period
sc <- dplyr::filter(clim_model, years %in% start_year:end_year)

start_time <- Sys.time()
# Run simulations for males and females
res <- lapply(names(adj_proj_rates), function(x)
{
  # Select population
  temp_dem <- adj_proj_rates[[x]] %>%
    filter(year %in% start_year:end_year)
  # select weight
      temp_weight <- un_an_weight[[x]]
    # Launch forecasting along the rcp scenario with simulations
  # and parallelization
  f_mort <- forecast_mort_dlnm(temp_dem, sc, fit[[x]], temp_weight,
                               q_range = c(q025, q975),
                               sensi_cen = 0,
                               nsim = nsim, parallel = parallel,
                               ncpus = ncpus)
  # Add sex indexes
  f_mort <- lapply(f_mort, function(tt){
    if(is.null(tt))
    {
      return(tt)
    } else{
      return(
        tt %>%
          dplyr::mutate(gender = x) %>%
          relocate(gender, temp_effect, .before =  sim)
      )
    }
  })
  return(f_mort)
})

# Merge more than two lists with the same element name
res <- Reduce(function(...) Map("rbind", ...), res)

# Save and printtime after operation
end_time <- Sys.time()
time_diff <- end_time - start_time
print(paste("Run time:", as.numeric(time_diff, units = "mins"), "mins"))
## [1] "Run time: 591.297907698154 mins"

5.2 Dislaying attributable fraction

In the following figures, we display the projected attributable fraction related to future temperatures in the selected climate scenario.

To facilitate visual analysis, we calculate an aggregated attributable fraction to temperatures using the distribution of deaths known at the end of 2019, as follows \[ \widehat{\text{AF}}_{t} (\mathcal{D}_t) = \sum_{g \in \mathcal{G}} { \sum_{x \in \mathcal{X}}{\widehat{\text{AF}}_{x,t}^{(g)} (\mathcal{D}_t) \frac{D_{x, 2019}^{(g)}}{D_{2019}} }}, \]

where \(D_{2019} = \sum_{g \in \mathcal{G}}{\sum_{x \in \mathcal{X}}{D_{x, 2019}^{(g)}}}.\)

# extract the last year 2019, for calculate exposure to risk
expo <- do.call("rbind", lapply(mort_dt$UNI, function(dt){
  dt <- dt$dtx[nrow(dt$dtx),  ]
  return(data.frame(age = 0:105,
                    w = dt,
                    age_bk = c(rep("0-64", 65), rep("65-74", 10),
                               rep("75-84", 10), rep("85+", 21))
                    ))
}))
expo$gender <- c(rep("f", 106), rep("m", 106))
# Agregate expo per age bucket
W <- sum(expo$w)
expo <- expo %>%
  group_by(age_bk, gender) %>%
  summarise(w = sum(w) / W)
# Specify the RCP number
res$tab_excess$traj_clim <- "ALADIN63_CNRM-CM5"
res$tab_excess$rcp <- "8.5"

# Weighted attributed fraction
plot_aff <- summary_forecast_temp_effect(res$tab_excess, expo, var = "af_weight_year")
# Attributed fraction with constant weight
plot_aff_no_weight <- summary_forecast_temp_effect(res$tab_excess, expo, var = "af_no_weight_year")

# Plot aggregate temperature attributed fraction (with weights)
plot_aff[[1]]

p.comb <- plot_aff[[2]]

# Plot temperature attributed fraction for each age group (with weights)
p.comb[[1]]

p.comb[[2]]

p.comb[[3]]

p.comb[[4]]

5.3 Assessing the loss in life expectancy

Finally, we assess the loss in life expectancy due to future temperatures.

res$tab_ex$traj_clim <- "ALADIN63_CNRM-CM5"
res$tab_ex$rcp <- "8.5"
plot_ev <- summary_forecast_ev_effect(res$tab_ex)

# Plot LE including temperature effects (with weights)
plot_ev[[1]]

# Plot lost in LE including temperature effects  (with weights)
plot_ev[[2]]

6 Conclusion

This Notebook illustrates the implementation of a multi-population mortality model incorporating the effect of temperature changes on mortality. The construction of this framework is simplified by omitting multiple scenarios and various climate models. Therefore, it does not exactly reproduce the results presented in the paper.

References

Antonio, Katrien, Sander Devriendt, and Jens Robben. 2022. The MultiMoMo package.” url: https://github.com/jensrobben/MultiMoMo.
Brouhns, N., M. Denuit, and J. K. Vermunt. 2002. “A Poisson Log-Bilinear Regression Approach to the Construction of Projected Lifetables.” Insurance: Mathematics and Economics 31 (3). https://doi.org/10.1016/S0167-6687(02)00185-3.
Gasparrini, A. 2011. “Distributed Lag Linear and Non-Linear Models in R: The Package dlnm.” Journal of Statistical Software 43 (8): 1–20. https://doi.org/10.18637/jss.v043.i08.
Gasparrini, A, B Armstrong, and M G Kenward. 2010. “Distributed Lag Non-Linear Models.” Statistics in Medicine 29 (21): 2224–34. https://doi.org/10.1002/sim.3940.
Gasparrini, Antonio. 2014. “Modeling Exposure-Lag-Response Associations with Distributed Lag Non-Linear Models.” Statistics in Medicine 33 (5): 881–99. https://doi.org/10.1002/sim.5963.
Gasparrini, Antonio, and Michela Leone. 2014. “Attributable Risk from Distributed Lag Models.” BMC Medical Research Methodology 14 (1): 55. https://doi.org/10.1186/1471-2288-14-55.
Li, Nan, and Ronald Lee. 2005. “Coherent Mortality Forecasts for a Group of Populations: An Extension of the Lee-Carter Method.” Demography 42 (3): 575–94. https://doi.org/10.1353/dem.2005.0021.